The universality of simple distributional methods: Identifying syntactic categories in Mandarin Chinese
Authors
Abstract
Acquiring language is a difficult and complex problem. The traditional assumption is that infants acquire language by virtue of innate knowledge, which reduces the problem to one of tuning this innate knowledge of language in general to the specific characteristics of the language spoken in the infant’s early environment (e.g., Chomsky, 1980). Nevertheless, without empirical assessment of potential learning mechanisms and sources of information, it may be premature to decide which aspects of language, if any, require drawing on innate knowledge. This paper considers a simple distributional learning mechanism that does not draw on explicit prior knowledge. This method has previously been shown to be informative about the syntactic category membership of individual words in English, French, and German. We ask whether it can provide similar constraints in Chinese.
Related resources
Comparative Study in Mandarin Square badge designs between Ilkhanid and Timurid garments with Yuan and Ming Chinese garments
With the conquest of China and Iran by the Mongols, the influence of Chinese styles and methods appeared in all the visual arts, including the patterns of fabrics. These designs were also used on the clothes of those in power, which was of special importance in different periods and was considered a royal emblem. Mandarin square is one of the royal symbols. This Chinese royal emblem was also us...
Distributional Information: A Powerful Cue for Acquiring Syntactic Categories
Many theorists have dismissed a priori the idea that distributional information could play a significant role in syntactic category acquisition. We demonstrate empirically that such information provides a powerful cue to syntactic category membership, which can be exploited by a variety of simple, psychologically plausible mechanisms. We present a range of results using a large corpus of child-...
Transitivity in Light Verb Variations in Mandarin Chinese - A Comparable Corpus-based Statistical Approach
This paper adopts a comparable corpus-based approach to light verb variations in two varieties of Mandarin Chinese and proposes a transitivity (Hopper and Thompson 1980) based theoretical account. Light verbs are highly grammaticalized and lack strong collocation restrictions; hence it has been a challenge to empirical accounts. It is even more challenging to consider their variations between d...
The Bracketing Guidelines for the Penn Chinese Treebank (3.0)
This document describes the bracketing guidelines for the Penn Chinese Treebank Project. The goal of the project is the creation of a 100-thousand-word corpus of Mandarin Chinese text with syntactic bracketing. The Chinese Treebank has been released via the Linguistic Data Consortium (LDC) and is available to the public. This document can be divided into six parts. Section I discusses six funda...
Towards a Model of Prediction-based Syntactic Category Acquisition: First Steps with Word Embeddings
We present a prototype model, based on a combination of count-based distributional semantics and prediction-based neural word embeddings, which learns about syntactic categories as a function of (1) writing contextual, phonological, and lexical-stress-related information to memory and (2) predicting upcoming context words based on memorized information. The system is a first step towards utiliz...
Publication date: 1995